Deploying pdf2Data Editor manually
This guide describes how to manually configure and deploy pdf2Data Editor. It requires a deep understanding of Docker and its possible pitfalls, which are outside the scope of this guide. Manual deployment has a few obvious shortcomings:
- It is error-prone and time-consuming.
- Not only the initial installation but also every upgrade requires manually reviewing and changing configuration files.
That's why we recommend, whenever possible, following the guide "Deploying pdf2Data Editor with a helper script" instead.
Prerequisites
Since pdf2Data Editor is provided as Docker containers, we assume that you have some familiarity with containerization, particularly with Docker Compose.
To deploy and start the app, the following software must be pre-installed.
- Docker 19.03.0+. To verify the installation, run docker --version
- Docker Compose plugin 1.27.0+. To verify the installation, run docker compose --version in a terminal
All pdf2Data Editor components are available as images on AWS ECR (or on Docker Hub for older versions), so your system must be able to access these registries.
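The two checks above can be combined into a small script. The sketch below is only an illustration and not part of the product: the function names version_ge, report and check_prerequisites are made up for this example, and a POSIX shell with awk is assumed. It reads the installed versions with docker version --format and docker compose version --short and compares them with the minimum versions listed above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when dotted version $1 >= dotted version $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      xi = (i <= n) ? x[i] + 0 : 0   # "+ 0" forces a numeric comparison
      yi = (i <= m) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Hypothetical helper: print whether a tool (name, found version, minimum) is recent enough.
report() {
  if [ -n "$2" ] && version_ge "$2" "$3"; then
    echo "$1 $2: OK (>= $3)"
  else
    echo "$1 ${2:-not found}: version $3 or newer is required" >&2
    return 1
  fi
}

# Hypothetical helper: check Docker and the Compose plugin against the minimums from this guide.
check_prerequisites() {
  docker_version=$(docker version --format '{{.Client.Version}}' 2>/dev/null)
  compose_version=$(docker compose version --short 2>/dev/null)
  status=0
  report "Docker" "$docker_version" 19.03.0 || status=1
  report "Docker Compose" "${compose_version#v}" 1.27.0 || status=1   # strip a leading "v" if present
  return "$status"
}

check_prerequisites || echo "Prerequisites are not met." >&2
```

The comparison is numeric per component, so 19.03.0 and 19.3.0 are treated as equal and suffixes such as -rc1 are ignored.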
Deployment
Create a Docker Compose configuration file
To deploy the application, create a docker-compose.yml file with the following content:
Lightweight UI
services:
  pdf2data-editor:
    container_name: pdf2data-editor
    pull_policy: always
    restart: unless-stopped
    image: public.ecr.aws/apryse/pdf2data-editor:{version} # replace {version} placeholder with the actual app version
    env_file: .env
    ports: ['80:8080']
Fully functional UI (recommended)
services:
  #=========== FRONTEND ===============
  pdf2data-manager-frontend:
    container_name: pdf2data-manager-frontend
    restart: unless-stopped
    pull_policy: always
    image: public.ecr.aws/apryse/pdf2data-manager-frontend:{version} # replace {version} placeholder with the actual app version
    ports: ['80:8080']
    env_file: .env
    depends_on: [pdf2data-manager-backend, pdf2data-editor]
  #=========== BACKEND ===============
  pdf2data-manager-backend:
    container_name: pdf2data-manager-backend
    restart: unless-stopped
    pull_policy: always
    image: public.ecr.aws/apryse/pdf2data-manager-backend:{version} # replace {version} placeholder with the actual app version
    env_file: .env
    depends_on: [pdf2data-manager-db]
  #=========== DATABASE ===============
  pdf2data-manager-db:
    container_name: pdf2data-manager-db
    restart: unless-stopped
    pull_policy: always
    image: public.ecr.aws/apryse/pdf2data-manager-db:{version} # replace {version} placeholder with the actual app version
    env_file: .env
    volumes: ['pdf2data-manager-db:/var/lib/postgresql/data']
  #=========== EDITOR =================
  pdf2data-editor:
    container_name: pdf2data-editor
    restart: unless-stopped
    pull_policy: always
    image: public.ecr.aws/apryse/pdf2data-editor:{version} # replace {version} placeholder with the actual app version
    env_file: .env
#=========== GENERAL ================
volumes:
  pdf2data-manager-db: null
networks:
  default:
    name: pdf2data-manager-network
To install a version of the application prior to 4.2.0, drop the public.ecr.aws/ prefix from all image names so that the images are downloaded from Docker Hub instead.
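Filling in the {version} placeholder and, for older versions, dropping the registry prefix can be scripted instead of edited by hand. The following is a minimal sketch under stated assumptions: the helper names set_version and use_docker_hub are invented for this example, the version number 4.1.0 is only a sample of a version prior to 4.2.0, and the image lines are assumed to look exactly like the ones in the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: replace the {version} tag placeholder on lines read from stdin.
# Only ":{version}" is replaced, so the trailing "# replace {version} ..." comments stay intact.
set_version() {
  sed "s/:{version}/:$1/"
}

# Hypothetical helper: drop the public.ecr.aws/ registry prefix (for versions prior to 4.2.0).
use_docker_hub() {
  sed 's#image: public\.ecr\.aws/#image: #'
}

# Demonstration on a single image line taken from the configuration above.
line='    image: public.ecr.aws/apryse/pdf2data-editor:{version}'
printf '%s\n' "$line" | set_version 4.2.0
printf '%s\n' "$line" | set_version 4.1.0 | use_docker_hub
```

To process a whole file, redirect it through the same functions, for example set_version 4.2.0 < docker-compose.template.yml > docker-compose.yml, where docker-compose.template.yml is a hypothetical name for a copy of the configuration that still contains the placeholders.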
Create an environment configuration file
A number of environment variables can be used to configure pdf2Data Editor. They are grouped by purpose and have self-explanatory names with default values. All variables must be set in a separate file, .env, which must be located in the same directory as the docker-compose.yml file.
The Editor page uses Apryse WebViewer; without a license, the PDF file is displayed with a watermark.
To remove the watermark, set the PDF2DATA_EDITOR_WEB_VIEWER_API_TOKEN variable to your license value.
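Before starting the containers, it can help to confirm that the .env file actually assigns the variables you care about. The sketch below is an illustration only: missing_vars is a made-up helper name, and which variables are mandatory is an assumption you should check against the product documentation. The helper prints every listed name that has no non-empty assignment in the given file.

```shell
#!/bin/sh
# Hypothetical helper: print each variable name (arguments 2..n) that has no
# non-empty assignment in the env file given as argument 1.
missing_vars() {
  env_file=$1
  shift
  for name in "$@"; do
    # Match "NAME=" (optionally surrounded by spaces) followed by at least one non-space character.
    grep -Eq "^[[:space:]]*${name}[[:space:]]*=[[:space:]]*[^[:space:]]" "$env_file" 2>/dev/null \
      || echo "$name"
  done
}

# Demonstration on a throwaway file; in a real deployment, pass the path to your .env file.
sample=$(mktemp)
printf '%s\n' 'PDF2DATA_EDITOR_MODE=MANAGER' 'PDF2DATA_MANAGER_DEFAULT_ADMIN_EMAIL=' > "$sample"
missing_vars "$sample" PDF2DATA_EDITOR_MODE PDF2DATA_MANAGER_DEFAULT_ADMIN_EMAIL PDF2DATA_MANAGER_DEFAULT_TOKEN_PRIVATE_KEY
rm -f "$sample"
```

In the demonstration, the e-mail variable is reported because its value is empty and the token key is reported because it is absent. The check is deliberately simple: a value that consists only of a comment would still count as set.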
Below is a sample of the .env file:
Lightweight UI:
PDF2DATA_EDITOR_MODE=STANDALONE
PDF2DATA_EDITOR_URL=pdf2data-editor:8080
PDF2DATA_EDITOR_CONTAINER_MEMORY_LIMIT=2048
PDF2DATA_EDITOR_JVM_MEMORY_LIMIT_MB=1848
PDF2DATA_EDITOR_WEB_VIEWER_API_TOKEN=... # WebViewer license value
Fully functional UI:
PDF2DATA_EDITOR_MODE=MANAGER
PDF2DATA_EDITOR_TEMPLATE_REPOSITORY_MANAGER_HOST=http://pdf2data-manager-backend:8080/api
PDF2DATA_EDITOR_URL=pdf2data-editor:8080
PDF2DATA_EDITOR_WEB_VIEWER_API_TOKEN=... # WebViewer license value
PDF2DATA_MANAGER_BACKEND_URL=pdf2data-manager-backend:8080
PDF2DATA_MANAGER_MULTIPLE_WORKSPACES=false
PDF2DATA_MANAGER_DEFAULT_ADMIN_EMAIL=<valid email address>
PDF2DATA_MANAGER_DEFAULT_ADMIN_PASSWORD=<SOME_SECURE_PASSWORD>
PDF2DATA_MANAGER_DEFAULT_TOKEN_PRIVATE_KEY=... # ! Fill with Base64-encoded value of a random key. Minimum 512 bit keys are recommended
PDF2DATA_MANAGER_DB_NAME=postgres
PDF2DATA_MANAGER_DB_PASSWORD=postgres
PDF2DATA_MANAGER_DB_SCHEMA=manager
PDF2DATA_MANAGER_DB_URL=pdf2data-manager-db:5432
PDF2DATA_MANAGER_DB_USERNAME=postgres
PDF2DATA_MANAGER_JPA_GENERATE_STATISTICS=false
PDF2DATA_MANAGER_JPA_SHOW_SQL=false
POSTGRES_DB=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_USER=postgres
PDF2DATA_EDITOR_CONTAINER_MEMORY_LIMIT=819M
PDF2DATA_EDITOR_JVM_MEMORY_LIMIT_MB=619
PDF2DATA_MANAGER_BACKEND_CONTAINER_MEMORY_LIMIT=819
PDF2DATA_MANAGER_BACKEND_JVM_MEMORY_LIMIT_MB=619
PDF2DATA_MANAGER_DATABASE_CONTAINER_MEMORY_LIMIT=205M
PDF2DATA_MANAGER_FRONTEND_CONTAINER_MEMORY_LIMIT=205M
For more information on the available settings and their meaning, see Customizing pdf2Data Editor application.
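The sample asks for a Base64-encoded random key of at least 512 bits in PDF2DATA_MANAGER_DEFAULT_TOKEN_PRIVATE_KEY. One possible way to produce such a value is sketched below. It assumes that the key is plain random bytes with no further structure, which this guide does not state explicitly, and that /dev/urandom and the base64 utility are available. The function name generate_token_key is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print N random bytes (default 64, i.e. 512 bits) as one Base64 line.
generate_token_key() {
  head -c "${1:-64}" /dev/urandom | base64 | tr -d '\n'   # tr joins the wrapped Base64 lines
}

key=$(generate_token_key)
echo "PDF2DATA_MANAGER_DEFAULT_TOKEN_PRIVATE_KEY=$key"
```

The printed line can be pasted into the .env file. Treat the value as a secret and generate a separate key for each environment.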
Run / Stop the application
To run the application, execute the following command in the folder that contains the docker-compose.yml and .env files described above:
docker-compose up -d
To stop the application, execute the following command in the same folder:
docker-compose down
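Depending on how Compose was installed, the command is spelled either docker compose (plugin) or docker-compose (standalone binary). The sketch below is not part of the product; compose_cmd and compose are made-up helper names. It picks whichever variant is available so that the same start and stop steps work on both setups.

```shell
#!/bin/sh
# Hypothetical helper: print the Compose command available on this machine.
compose_cmd() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"
  else
    return 1
  fi
}

# Hypothetical wrapper: run a Compose subcommand with whichever variant was found.
compose() {
  cmd=$(compose_cmd) || { echo "Docker Compose not found" >&2; return 1; }
  $cmd "$@"   # intentionally unquoted so that "docker compose" splits into two words
}

# Show which variant would be used on this machine.
compose_cmd || echo "Docker Compose not found" >&2
```

With these helpers loaded, compose up -d starts the application, compose ps lists the running containers, and compose down stops them. Note that down keeps named volumes such as pdf2data-manager-db; they are removed only if you pass the -v option.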